Salary

This sub-chapter analyzes salaries for different occupations in New York City. All values are the yearly salary of the corresponding occupation.

Overview of Salary Distribution

To get an overview of how salary is distributed across occupations in New York City, we first draw a Cleveland dot plot of the 10-year average salary of each occupation. The plot shows a large difference in salary between occupations. The range, that is, the gap between the highest and lowest average salaries, is 72947, which is about 3 times the minimum salary.
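The range and its ratio to the minimum can be computed as in the following minimal Python sketch. The occupation names and salary values are hypothetical placeholders, not the chapter's data, and `salary_range` is an illustrative helper name.

```python
# Hypothetical 10-year-average salaries; NOT the values behind the plot.
avg_salary = {
    "Occupation A": 90000.0,
    "Occupation B": 55000.0,
    "Occupation C": 30000.0,
}


def salary_range(salaries):
    """Return (range, range / minimum) for a mapping occupation -> salary."""
    lowest = min(salaries.values())
    highest = max(salaries.values())
    spread = highest - lowest
    return spread, spread / lowest


spread, ratio = salary_range(avg_salary)
print(f"range = {spread:.0f}, range / minimum = {ratio:.2f}")
```

With the chapter's data, a ratio close to 3 would correspond to the statement that the range is about 3 times the minimum salary.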

Analysis of salary variation by year

Top 3 and Last 3 Occupations in Salary

Top 3

  1. Legal occupations: Salary increases monotonically, and the rate of increase is accelerating.
  2. Health diagnosing and treating practitioners and other technical occupations: Salary also increases monotonically.
  3. Computer and mathematical occupations: The salary trend of this group is similar to that of the previous two groups.

Last 3

  1. Food preparation and serving related occupations: This occupation had the lowest salary from 2010 to 2013. However, its salary has been increasing year over year.
  2. Farming, fishing, and forestry occupations: Salary decreased first and then increased. However, even after the increase, it did not return to its 2010 to 2013 level.
  3. Personal care and service occupations: Salary also decreased first and then increased. Unlike farming, fishing, and forestry occupations, it dropped only slightly before rising substantially. Overall, therefore, the salary of this occupation increased.

Variance Percentages

It is also important to analyze how salaries vary within each occupation. Because occupations have different base wages, it can be more meaningful to express wage fluctuations as a percentage of the wage. Here, we use the average wage to represent the wage of each occupation.
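As a rough illustration of expressing a fluctuation relative to the average wage, the Python sketch below divides the signed change between the first and last year by the mean salary over all years. The text does not give the exact formula, so this definition, the function name `percent_variation`, and the sample values are all assumptions.

```python
from statistics import mean


def percent_variation(yearly_salaries):
    """Signed first-to-last-year change as a percentage of the mean salary.

    ASSUMPTION: one plausible reading of "percentage of wage fluctuation";
    the chapter's own calculation may differ.
    """
    years = sorted(yearly_salaries)
    first = yearly_salaries[years[0]]
    last = yearly_salaries[years[-1]]
    return 100.0 * (last - first) / mean(yearly_salaries.values())


# Hypothetical salaries for one occupation, keyed by year.
example = {2010: 40000.0, 2015: 50000.0, 2019: 60000.0}
print(round(percent_variation(example), 1))
```

Under this definition, a salary that falls over the period gives a negative value.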

As the above plot shows, the majority of these occupations have positive variances over the past decade; only two categories have negative variances. Among all occupations, Construction and extraction occupations have the largest variation in salary from 2010 to 2019, and Healthcare support occupations have the smallest.

Boxplot

To see the salary variances of the 25 occupations in detail, we draw boxplots to make comparisons.

Analysis of salary differences by county

As can be seen in this plot, the counties with the highest and lowest wages differ from occupation to occupation. For the majority of the occupations, the highest salary is in New York County and the lowest is in Bronx County. The detailed distribution is shown in the following table.

General Distribution of the Highest and Lowest Wages in Different Counties

| Boroughs | With the lowest wage | With the highest wage |
|---|---|---|
| Bronx County | Legal occupations; Health diagnosing and treating practitioners and other technical occupations; Computer and mathematical occupations; Management occupations; Business and financial operations occupations; Architecture and engineering occupations; Life, physical, and social science occupations; Arts, design, entertainment, sports, and media occupations; Health technologists and technicians; Education, training, and library occupations; Installation, maintenance, and repair occupations; Community and social service occupations; Construction and extraction occupations; Office and administrative support occupations; Sales and related occupations; Transportation occupations; Fire fighting and prevention, and other protective service workers including supervisors; Production occupations; Building and grounds cleaning and maintenance occupations; Personal care and service occupations; Food preparation and serving related occupations | None |
| Kings County | Law enforcement workers including supervisors | None |
| New York County | Healthcare support occupations; Material moving occupations; Farming, fishing, and forestry occupations | Legal occupations; Health diagnosing and treating practitioners and other technical occupations; Computer and mathematical occupations; Management occupations; Arts, design, entertainment, sports, and media occupations; Health technologists and technicians; Education, training, and library occupations; Community and social service occupations; Office and administrative support occupations; Sales and related occupations; Food preparation and serving related occupations |
| Queens County | None | Farming, fishing, and forestry occupations |
| Richmond County | None | Law enforcement workers including supervisors; Business and financial operations occupations; Architecture and engineering occupations; Life, physical, and social science occupations; Installation, maintenance, and repair occupations; Construction and extraction occupations; Transportation occupations; Fire fighting and prevention, and other protective service workers including supervisors; Production occupations; Building and grounds cleaning and maintenance occupations; Material moving occupations; Personal care and service occupations |

From the above table, we observe the following characteristics.

  1. The majority of the highest salaries occur in New York County, many occur in Richmond County, one occurs in Queens County, and none occur in Kings County or Bronx County.
  2. Most of the lowest salaries occur in Bronx County. Several occupations have their lowest salary in New York County, and one occupation, Law enforcement workers including supervisors, has its lowest salary in Kings County.
  3. The occupations with their lowest salaries in New York County tend to be occupations with relatively low incomes.
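A table of this kind can be derived by finding, for each occupation, the county with the smallest and the largest wage, and then counting how often each county appears. The following Python sketch shows the idea; the occupations, wage values, and the helper name `extreme_counties` are hypothetical.

```python
from collections import Counter

# Hypothetical average wages per occupation and county; NOT the chapter's data.
wages = {
    "Occupation A": {"Bronx County": 50000, "Kings County": 52000, "New York County": 70000},
    "Occupation B": {"Bronx County": 30000, "Kings County": 29000, "New York County": 35000},
}


def extreme_counties(wages_by_occupation):
    """Map each occupation to (county with lowest wage, county with highest wage)."""
    result = {}
    for occupation, by_county in wages_by_occupation.items():
        lowest = min(by_county, key=by_county.get)
        highest = max(by_county, key=by_county.get)
        result[occupation] = (lowest, highest)
    return result


extremes = extreme_counties(wages)

# Count how many occupations have their lowest / highest wage in each county.
lowest_counts = Counter(low for low, _ in extremes.values())
highest_counts = Counter(high for _, high in extremes.values())
print(extremes)
print(lowest_counts, highest_counts)
```

The two counters correspond to the per-county totals that a bar chart of highest and lowest wages would display.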

Next, we draw a bar chart showing how the highest and lowest wages are distributed across counties (averaged over all years).

Distribution of the Highest and Lowest Wages in Different Counties by Years

To see whether the distribution of the highest and lowest wages across counties changes over time, we draw a stacked bar chart by year. Different colors represent different counties.

As this plot shows, the picture for individual years differs slightly from the overall average in the following ways.

For the max salary

In the overall average, no occupation has its maximum salary in Bronx County or Kings County. However, the stacked bar chart shows that in 2010, 2012, 2013, and 2014, some occupations had their highest salary in Bronx County. Also, in every year except 2014 and 2017, some occupations had their highest salary in Kings County.

For the min salary

In the overall average, no occupation has its minimum salary in Queens County or Richmond County. However, the stacked bar chart shows that in every year except 2013, some occupations had their lowest salary in Queens County. Also, in every year except 2018, some occupations had their lowest salary in Richmond County.

Variations in salaries between different counties

We also find that the variation in salary among boroughs differs from occupation to occupation. Therefore, we use a bar chart to rank all types of occupations by their degree of variation among boroughs. For each occupation, we subtract the smallest county salary from the salary in each of the five counties, add up the differences, and divide the sum by 5. Then, we divide this value by the smallest salary to obtain the variation measure for the occupation.
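The variation measure described above can be written as a short Python sketch: the mean excess over the smallest county salary, divided by that smallest salary. The function name `county_variation` and the sample salaries are hypothetical.

```python
def county_variation(salaries):
    """Mean excess over the smallest salary, relative to the smallest salary.

    `salaries` holds one occupation's salary in each county (five counties
    in the chapter, so the division by len(salaries) is the division by 5).
    """
    smallest = min(salaries)
    mean_excess = sum(s - smallest for s in salaries) / len(salaries)
    return mean_excess / smallest


# Hypothetical salaries of one occupation in five counties.
example = [40000.0, 44000.0, 48000.0, 52000.0, 56000.0]
print(county_variation(example))
```

Because the result is divided by the smallest salary, it is unit-free, which makes occupations with very different wage levels comparable.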

Top 5 in Variation

  1. Sales and related occupations
  2. Legal occupations
  3. Management occupations
  4. Farming, fishing, and forestry occupations
  5. Arts, design, entertainment, sports, and media occupations

Last 5 in Variation

  1. Personal care and service occupations
  2. Health technologists and technicians
  3. Community and social service occupations
  4. Food preparation and serving related occupations
  5. Life, physical, and social science occupations

Analysis of salary differences by gender

As can be seen in this Cleveland dot plot, salaries in some occupations vary a lot between genders, while in other occupations the two genders have similar salaries. Also, in some occupations men have higher salaries, and in others women have higher salaries. To understand these characteristics better, we analyze salaries by gender within each occupation in more detail.

Salaries Variance between Genders

We use a bar chart to rank occupations by the salary variance between genders. To quantify the difference, we divide the income difference between male and female employees by the average salary of the occupation.

General Characteristics of Salaries Variance between Genders

From the horizontal bar chart above, we observe the following characteristics. In most occupations, male employees have higher salaries than female employees. Female employees have higher salaries in only 4 of the 25 kinds of occupations, namely: Construction and extraction occupations; Installation, maintenance, and repair occupations; Community and social service occupations; and Transportation occupations.
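The gender gap measure described above, the male and female income difference divided by the occupation's average salary, can be sketched in Python as follows. Taking the average salary as the simple mean of the male and female salaries is an assumption (the text does not say whether the average is weighted by headcount), and the function name and sample values are hypothetical.

```python
def gender_gap(male_salary, female_salary):
    """Signed relative gap: positive when men earn more, negative when women do.

    ASSUMPTION: the occupation's average salary is the unweighted mean of
    the male and female salaries.
    """
    average = (male_salary + female_salary) / 2
    return (male_salary - female_salary) / average


# Hypothetical (male salary, female salary) pairs per occupation.
salaries = {
    "Occupation A": (60000.0, 40000.0),
    "Occupation B": (45000.0, 55000.0),
}
gaps = {name: gender_gap(m, f) for name, (m, f) in salaries.items()}

# Rank occupations by the size of the gap, regardless of its direction.
ranked = sorted(gaps, key=lambda name: abs(gaps[name]), reverse=True)
print(gaps, ranked)
```

Keeping the sign separate from the ranking makes it possible to order occupations by the size of the gap while still seeing which gender earns more.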

Top 5 and Last 5 in Variance

Top 5

  1. Sales and related occupations
  2. Building and grounds cleaning and maintenance occupations
  3. Material moving occupations
  4. Production occupations
  5. Personal care and service occupations

Last 5

  1. Transportation occupations
  2. Office and administrative support occupations
  3. Community and social service occupations
  4. Installation, maintenance, and repair occupations
  5. Computer and mathematical occupations

Relation between Size and Stability of Gender Salary Difference

We also want to analyze how stable the salary difference between genders is, to see whether there is a relation between the size and the stability of the variance.

However, there is no clear connection between the size and the stability of the variance.

Relation between Gender Differences in Number of Employees and in Salary

Intuitively, the gender composition of employees in a profession is related to which gender earns the higher wage in it. We want to check whether this intuition holds. Therefore, we use two categorical variables to represent the two characteristics, namely “Gender Distribution” and “Salary Distribution”.

“Gender Distribution” has two values: Male-dominated, which means the occupation has more male than female employees, and Female-dominated, which means it has more female than male employees. “Salary Distribution” also has two values: Male-higher, which means male employees have the higher salary in the occupation, and Female-higher, which means female employees have the higher salary.

Then, we draw a mosaic plot to measure the relation. The mosaic plot shows that salary distribution is related to gender composition. However, the direction of this relation runs against intuition. We tend to think that the “Female-higher” group will contain more female-dominated occupations and the “Male-higher” group will contain more male-dominated occupations. The plot shows the opposite.
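The two categorical variables and the 2x2 table of counts behind a mosaic plot can be built as in the following Python sketch. The records are hypothetical, `classify` is an illustrative helper, and assigning ties to the Female-dominated and Female-higher categories is an arbitrary assumption, since the text does not say how ties are handled.

```python
from collections import Counter


def classify(male_count, female_count, male_salary, female_salary):
    """Return the (Gender Distribution, Salary Distribution) labels.

    ASSUMPTION: ties fall into the Female-* categories.
    """
    gender = "Male-dominated" if male_count > female_count else "Female-dominated"
    salary = "Male-higher" if male_salary > female_salary else "Female-higher"
    return gender, salary


# Hypothetical occupations:
# (male employees, female employees, male salary, female salary)
records = [
    (900, 100, 50000.0, 52000.0),
    (200, 800, 45000.0, 40000.0),
    (600, 400, 70000.0, 60000.0),
]

# Each (gender label, salary label) pair is one cell of the 2x2 table.
table = Counter(classify(*record) for record in records)
for cell, count in sorted(table.items()):
    print(cell, count)
```

A mosaic plot draws each cell of this table with an area proportional to its count, so an association between the two variables shows up as unequal tile sizes.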